1 . Introduction

In a country as vast as Australia, driving is not a luxury but a necessity. The number of people killed or injured in road accidents has long been a concern, and the government has responded with strict road safety rules and regulations. Following these measures, the number of road accidents has fallen noticeably. This prompted me to investigate how crashes have decreased over time. As a responsible resident, I also need to educate myself on road accident data and on how its patterns develop and evolve. I therefore decided to study the frequency of road crashes in order to learn more about the factors and places that contribute to these collisions.

2 . Data Description

The dataset was downloaded from the ACT Government Open Data Portal, where it is published as ACT Road Crash Data. With 71823 rows and 14 columns, it contains both tabular and geographic details, including the latitude and longitude of traffic incidents recorded in the Australian Capital Territory from 2012 to 2021. The records were entered by the police or the general public through the “AFP Crash Report Form”; crashes not reported through this form are not included (“ACT Road Crash Data”, 2021, para. 1).
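As a rough illustration of how the dimensions quoted above could be checked, the sketch below reads a CSV export and reports its row and column counts. It uses Python's standard library rather than the tooling used for this report. The three-row inline sample, including its values, is invented purely for illustration; with the real export, an opened file would take the place of the `io.StringIO` object.

```python
import csv
import io

# Hypothetical three-row excerpt with only four of the columns.
# The real export is reported to have 71823 rows and 14 columns.
SAMPLE_CSV = """CRASH_ID,CRASH_DATE,CRASH_TIME,SUBURB_LOCATION
1,14/05/2019,08:30,BELCONNEN
2,02/01/2020,17:05,CITY
3,23/07/2021,12:40,
"""

def load_rows(handle):
    """Read a CSV file object into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(handle))

def dimensions(rows):
    """Return (number of rows, number of columns) for a list of row dictionaries."""
    column_count = len(rows[0]) if rows else 0
    return len(rows), column_count

rows = load_rows(io.StringIO(SAMPLE_CSV))
print(dimensions(rows))
```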

3 . Data Wrangling

When it comes to data analysis, your results are only as good as your data (What Is Data Wrangling & Why Is It Necessary?, 2021). To report accurate results, the data must be cleaned. The table below describes the columns of the raw data, which will be converted into the desired dataset for the analysis.

Column Name         | Description
--------------------|------------------------------------------------
CRASH_ID            | ID number of crash
CRASH_DATE          | Date of crash
CRASH_TIME          | Time of crash
SUBURB_LOCATION     | Suburb where the crash occurred
LONGITUDE           | Longitude coordinate of crash
LATITUDE            | Latitude coordinate of crash
INTERSECTION        | Whether the crash occurred at an intersection
MIDBLOCK            | Whether the crash occurred mid-block
CRASH_DIRECTION     | Direction of travel at location of crash
CRASH_SEVERITY      | Level of crash severity
LIGHTING_CONDITION  | Lighting condition during the crash
ROAD_CONDITION      | Road condition during the crash
WEATHER_CONDITION   | Weather condition during the crash
Location            | Spatial values of the crash

A few columns from the original dataset were removed, and the remaining data underwent some cleaning for a better and more accurate analysis. For example, the Crash Date and Crash Time columns were split into separate columns and their data types were changed.
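The date and time split described above can be sketched as follows. This is a minimal Python illustration, not the code used for this report. It assumes that CRASH_DATE is stored as day/month/year text and CRASH_TIME as 24-hour HH:MM text; the new column names (YEAR, MONTH, DAY, HOUR) and the sample row are hypothetical.

```python
from datetime import datetime

def split_date_time(row, date_format="%d/%m/%Y", time_format="%H:%M"):
    """Return a copy of the row with numeric YEAR, MONTH, DAY and HOUR fields added.

    The default formats are assumptions about how the export stores dates and times.
    """
    crash_date = datetime.strptime(row["CRASH_DATE"], date_format)
    crash_time = datetime.strptime(row["CRASH_TIME"], time_format)
    enriched = dict(row)  # copy, so the raw row is left untouched
    enriched["YEAR"] = crash_date.year
    enriched["MONTH"] = crash_date.month
    enriched["DAY"] = crash_date.day
    enriched["HOUR"] = crash_time.hour
    return enriched

# Hypothetical row, used only to show the transformation.
sample_row = {"CRASH_ID": "1", "CRASH_DATE": "14/05/2019", "CRASH_TIME": "08:30"}
enriched_row = split_date_time(sample_row)
print(enriched_row["YEAR"], enriched_row["MONTH"], enriched_row["DAY"], enriched_row["HOUR"])
```

Storing the parts as integers rather than text is what makes the later grouping by year, month, and hour straightforward.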

3.1 Data Dimension

The graph above provides basic facts and figures about the dataset: discrete columns, continuous columns, the total number of null values, and missing observations. The dataset consists largely of continuous columns, with discrete variables making up 38% of the columns.

3.2 Missing Values

The data is almost complete. The only missing values are in the SUBURB_LOCATION column, which has 34 missing values.
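A missing-value count of this kind can be reproduced with a short per-column tally. The Python sketch below is illustrative only: it treats empty strings and absent values as missing, which is an assumption about how gaps appear in the export, and the sample rows are invented.

```python
from collections import Counter

def count_missing(rows):
    """Count empty or absent values per column across a list of row dictionaries."""
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            # Assumption: a gap shows up as None or as blank text.
            if value is None or str(value).strip() == "":
                missing[column] += 1
    return dict(missing)

# Hypothetical rows: two of the three have no suburb recorded.
sample_rows = [
    {"CRASH_ID": "1", "SUBURB_LOCATION": "BELCONNEN"},
    {"CRASH_ID": "2", "SUBURB_LOCATION": ""},
    {"CRASH_ID": "3", "SUBURB_LOCATION": None},
]
missing_counts = count_missing(sample_rows)
print(missing_counts)
```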

3.3 Duplicate Values

4 . Data Exploration

4.1 . An examination of road crash severity contributing to car accidents in ACT.

4.1.1 . Accidents timeline from 2012 to 2021.

The above chart indicates that in the earlier years the numbers rise and fall without any significant difference. In the most recent years, however, the green lines indicate a decline. This is because the pandemic in 2020 resulted in less traffic on the roads. The low count for 2021 is likely also due to a lack of data for that year.

In this time series plot, we can observe a trend of peaks during May, the rainy season. There is also a trough every January, around the holiday season. The numbers drop suddenly around January 2020, which is likely due to the pandemic: lockdowns led to a general drop in traffic on the roads during that period. In 2021, as the lockdown rules were relaxed and traffic increased, the number of accidents started to rise again.
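The monthly series behind such a plot amounts to counting crashes per year and month. The following Python sketch shows one way this could be done; the day/month/year date format is an assumption about the export, and the sample dates are invented.

```python
from collections import Counter
from datetime import datetime

def monthly_counts(crash_dates, date_format="%d/%m/%Y"):
    """Count crashes per (year, month) from a list of date strings."""
    counts = Counter()
    for text in crash_dates:
        parsed = datetime.strptime(text, date_format)
        counts[(parsed.year, parsed.month)] += 1
    return counts

# Hypothetical dates, used only to show the grouping.
sample_dates = ["14/05/2019", "20/05/2019", "02/01/2020", "23/05/2019"]
counts = monthly_counts(sample_dates)
for (year, month), total in sorted(counts.items()):
    print(f"{year}-{month:02d}: {total}")
```

Sorting the (year, month) keys puts the counts in chronological order, which is the order a time series plot needs.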

4.1.2 . Accidents Severity

Analysis of Accident Severity and Weather Conditions

Weather Conditions (left) and Accident Severity (right)

The Sankey plot explores weather conditions and their consequences for road accidents. Fine weather is the widest category, and crashes in fine weather resulted mostly in property damage. A large share of property damage also occurred during rainy weather. Injuries mostly occurred during fine weather, apart from a few that occurred during rainy or snowy weather. Fatalities, however, mostly occurred during rainy weather, whether light or heavy. This is consistent with the previous graph, where accident numbers peaked during the rainy season.
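The width of each link in a Sankey plot corresponds to the number of records that share a given pair of categories. A minimal Python sketch of that tally is shown below. The function name `link_weights` and the category labels in the sample rows are hypothetical; the actual labels in the dataset may differ.

```python
from collections import Counter

def link_weights(rows, source="WEATHER_CONDITION", target="CRASH_SEVERITY"):
    """Count how many crashes fall into each (source, target) category pair."""
    return Counter((row[source], row[target]) for row in rows)

# Hypothetical rows; the category labels are placeholders.
sample_rows = [
    {"WEATHER_CONDITION": "Fine", "CRASH_SEVERITY": "Property Damage Only"},
    {"WEATHER_CONDITION": "Fine", "CRASH_SEVERITY": "Property Damage Only"},
    {"WEATHER_CONDITION": "Fine", "CRASH_SEVERITY": "Injury"},
    {"WEATHER_CONDITION": "Rain", "CRASH_SEVERITY": "Property Damage Only"},
    {"WEATHER_CONDITION": "Rain", "CRASH_SEVERITY": "Fatal"},
]
weights = link_weights(sample_rows)
for (weather, severity), total in weights.most_common():
    print(f"{weather} -> {severity}: {total}")
```

Because fine weather is by far the most common condition, raw counts will favour it in every severity class; dividing each count by the total for its weather condition would give a fairer comparison of severity across conditions.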

4.2 . Identifying the most accident-prone suburbs/areas, as well as the factors/conditions that contribute to the collisions.

4.2.1 . Accident-prone Suburbs

4.2.2 . Factors that contribute to collisions

Most accidents occur in broad daylight, which suggests that external lighting conditions do not play a major role in the number of accidents.
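To put a figure on a statement like this, each lighting category can be expressed as a share of all crashes. The Python sketch below is one possible way to do that; the lighting labels in the sample are placeholders, not necessarily the values used in the dataset.

```python
from collections import Counter

def category_shares(values):
    """Return each category's share of the total as a percentage (one decimal place)."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: round(100 * count / total, 1) for category, count in counts.items()}

# Hypothetical LIGHTING_CONDITION values, used only for illustration.
sample_lighting = ["Daylight", "Daylight", "Daylight", "Dark"]
shares = category_shares(sample_lighting)
print(shares)
```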

4.3 . Which times of the day affect the frequency of accidents, and how have they changed over the years?

Road accidents are clearly more frequent during office hours. They peak at 8:00 AM, drop a little during the rest of the day, peak again at 5:00 PM, and then decline. Accidents are lowest between 12:00 AM and 5:00 AM.
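The hourly pattern described above comes from grouping crashes by the hour in which they occurred. A minimal Python sketch of that grouping is given below, assuming that CRASH_TIME is stored as 24-hour HH:MM text; the helper names and the sample times are hypothetical.

```python
from collections import Counter

def hourly_counts(crash_times):
    """Count crashes per hour of day (0 to 23) from 'HH:MM' strings."""
    return Counter(int(text.split(":")[0]) for text in crash_times)

def busiest_hours(counts, top_n=2):
    """Return the top_n hours with the most crashes, busiest first."""
    return [hour for hour, _ in counts.most_common(top_n)]

# Hypothetical times clustered around the morning and evening commute.
sample_times = ["08:05", "08:40", "08:55", "17:10", "17:45", "12:30", "02:15"]
counts = hourly_counts(sample_times)
print(busiest_hours(counts))
```

Running the same count separately for each year would show whether the morning and evening peaks have shifted over time.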

5 . Conclusion

6 . References

  1. Road Safety. Retrieved 7 August 2021, from https://www.infrastructure.gov.au/roads/safety/

  2. Connelly, & Supangan. (2016, November 6). ScienceDirect. Retrieved 20 March 2022, from https://www.sciencedirect.com/science/article/abs/pii/S0001457506000649

  3. ACT Road Crash Data. (2021). Retrieved 10 August 2021, from https://www.data.act.gov.au/Transport/ACT-Road-Crash-Data/6jn4-m8rx

  4. Car Accident Statistics 2020 | Car Research & Statistics - Budget Direct. (2020). Retrieved 7 August 2021, from https://www.budgetdirect.com.au/car-insurance/research/car-accident-statistics.html

  5. plot_intro function - RDocumentation. (2022). Retrieved 12 April 2022, from https://www.rdocumentation.org/packages/DataExplorer/versions/0.8.2/topics/plot_intro

  6. What Is Data Wrangling & Why Is It Necessary? (2021, May 14). MonkeyLearn Blog. https://monkeylearn.com/blog/data-wrangling/

  7. GGPLOT Point Shapes Best Tips - Datanovia. (2022). Retrieved 14 April 2022, from https://www.datanovia.com/en/blog/ggplot-point-shapes-best-tips/

  8. Holtz, Y. (2022). Interactive area chart with R and plotly. Retrieved 14 April 2022, from https://r-graph-gallery.com/163-interactive-area-chart-plotly.html